-
Notifications
You must be signed in to change notification settings - Fork 1.9k
[Bugfix][Elasticsearch] jobmanager memory leak (#9160) #9161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
seatunnel-connectors-v2/connector-elasticsearch/src/main/java/org/apache/seatunnel/connectors/seatunnel/elasticsearch/source/ElasticsearchSourceSplitEnumerator.java:211
- Consider verifying that the hosts list is not empty before calling hosts.get(0) to avoid a potential IndexOutOfBoundsException.
String endpoint = String.format("%s/_cat/indices/%s?h=index,docsCount&format=json", hosts.get(0), index);
.../seatunnel/connectors/seatunnel/elasticsearch/source/ElasticsearchSourceSplitEnumerator.java
Show resolved
Hide resolved
public void open() { | ||
esRestClient = EsRestClient.createInstance(connConfig); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we can not create client in open
method? Could you share more details?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After testing, I found that it is related to the CloseableHttpAsyncClient client in RestClient because this can cause the issue where although the file is deleted, the thread still holds an open handle to the temporary file.
in jobmanager
ls -l /proc/1/fd/ | grep deleted
lr-x------ 1 flink flink 64 Apr 14 05:44 381 -> /tmp/jm_00b72ebe540b0440ed4510730bb35a8b/blobStorage/job_fae4e51b34120030fbda6a238628bd3a/blob_p-3cfee1e57d60be6705eb5ca6458a32f526d30a3a-9b7914da78fe1a3c0389cdbbf9d1e481 (deleted)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean CloseableHttpAsyncClient
has bug so that not close properly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a factual perspective, yes. After replacing it with CloseableHttpClient, the #9160 was resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should raise a issue on ElasticSearch community or upgrade ElasticSearch dependency version to verify it fixed or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Hisoka-X Can it be merged now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i found the issue about this bug HttpAsyncClient 5.0 file descriptor leak
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the last elasticsearch-rest-client version,the httpasyncclient version is 4.1.5
org.elasticsearch.client elasticsearch-rest-client 8.17.4There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
seatunnel-connectors-v2/connector-elasticsearch/src/main/java/org/apache/seatunnel/connectors/seatunnel/elasticsearch/source/ElasticsearchSourceSplitEnumerator.java:90
- [nitpick] The open() (and close()) methods are now empty following the removal of esRestClient. Consider adding a clarifying comment in these methods to indicate that they are intentionally no-ops after the refactor.
public void open() {}
.../seatunnel/connectors/seatunnel/elasticsearch/source/ElasticsearchSourceSplitEnumerator.java
Outdated
Show resolved
Hide resolved
…org/apache/seatunnel/connectors/seatunnel/elasticsearch/source/ElasticsearchSourceSplitEnumerator.java fix statusCode!=SC_OK Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request fixes a memory leak issue in the Elasticsearch connector by refactoring the usage of HTTP clients and removing the persistent EsRestClient object in the source split enumerator.
- Removed the EsRestClient usage and now creates a new HTTP client per request via HttpClientUtil.
- Updated open() and close() methods to be empty and replaced them with a dedicated getIndexDocsCount() method for fetching data.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
seatunnel-connectors-v2/connector-elasticsearch/src/main/java/org/apache/seatunnel/connectors/seatunnel/elasticsearch/util/HttpClientUtil.java | Introduces a helper for creating SSL-configured HTTP clients. |
seatunnel-connectors-v2/connector-elasticsearch/src/main/java/org/apache/seatunnel/connectors/seatunnel/elasticsearch/source/ElasticsearchSourceSplitEnumerator.java | Removes the EsRestClient instance and adds a getIndexDocsCount() method for index data retrieval. |
Purpose of this pull request
fix #9160
Does this PR introduce any user-facing change?
How was this patch tested?
The issue can only be reproduced in a Flink on K8s deployment environment, specifically in session mode with HA (High Availability) configuration. It is difficult to test locally.
Check list
New License Guide
release-note
.